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Abstract 

A deterministic multiscale toy model is studied in which a chaotic fast subsystem 
triggers rare transitions between slow regimes, akin to weather or climate regimes. Us- 
ing homogenization techniques, a reduced stochastic parametrization model is derived 
for the slow dynamics. The reliability of this reduced climate model in reproducing 
the statistics of the slow dynamics of the full deterministic model for finite values of 
the time scale separation is numerically established. The statistics however is sensitive 
to uncertainties in the parameters of the stochastic model. 

It is investigated whether the stochastic climate model can be beneficial as a forecast 
model in an ensemble data assimilation setting, in particular in the realistic setting 
when observations are only available for the slow variables. The main result is that 
reduced stochastic models can indeed improve the analysis skill, when used as forecast 
models instead of the perfect full deterministic model. The stochastic climate model 
is far superior at detecting transitions between regimes. The observation intervals for 
which skill improvement can be obtained are related to the characteristic time scales 
involved. The reason why stochastic climate models are capable of producing supe- 
rior skill in an ensemble setting is due to the finite ensemble size; ensembles obtained 
from the perfect deterministic forecast model lacks sufficient spread even for moderate 
ensemble sizes. Stochastic climate models provide a natural way to provide sufficient 
ensemble spread to detect transitions between regimes. This is corroborated with nu- 
merical simulations. The conclusion is that stochastic parametrizations are attractive 
for data assimilation despite their sensitivity to uncertainties in the parameters. 
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1 Introduction 



An area of broad research in the atmospheric sciences in recent times is ho w to rnost ef- 
fecti vely parame t rize su bgrid-scale processes. Since the pioneering papers by iLeithI (Il975l ) 
and iHasselmannl (119761 ). the stochastic parametrization of processes which cannot be spa- 
tially and/or temporally resolved have recently g ained populari ty across disciplines in the 
atmospheric, oceanographic and climate sciences (jPalmerl. 2001 ) . Applications range frorn 



the st udy of atmospheric low-frequency variability (i Franzke et al. 



20051: iFranzke and Maida 



2006 ). deep-convection and cloud modelling in G CM's dLin and Neelinl . l2000U2002uPlant and Craigl . 
20081). decadal climate changes such as El Nino ( iKleemanl . |2008[ ) to paleoclimatic modelling 
( Ditlevsenl. ll999tlKwasniok and Lohmannl. 120091) . We refer the read er to the books edited by 
Imkeller and von Storchl (l200ll ) and by iPalmer and Williamsl ( 120101 ) which contain excellent 
overviews of current trends in stochastic climate modelling. 



There exists a plethora of different methods to construct stochastic subgrid-scale parametriza- 
tions, including phenomenological ap proaches such as ran domization of existing determin- 



istic parametrization s c heme s (e.R. Buizza et al. ( 19991 )). energetic backscattering 



[e.g. 



Fr.H.riW„ „nH DaviJB:ISh„t,H fan0.^1. H.t.a ^n te.knii^ such as MaAov chains 



e.g. Crommelin and Vanden-Eiindenl (|2008[)'). a nd systematic approaches using stochastic 



homogenization (e.g. iMajda et al 



"(ll999U2003h l The specific functional forms and pa- 
rameter values of the respective parametrization schemes can be heuri stically postulated on 
grounds of the particular physic s invol ved (e.g. iLin and Neelinl ( 120001 )). or estimated using 



time series analysis (e.g. Wilks ( 2005[ )). In the case of multiscale dynamics, however, the 
para meters can b e systematically d erived using averaging and homogenization techniques 
(e.g. iMajda et al. (Il999l l200l[ l2008h l Our work will be concerned with the latter approach. 
The general theory of stochastic averagi ng and homo g eniza ti on for r nultis cale dynamical sys- 
tems g oes back to the seminal papers of iKhasminskvi (1l966l ) , iKurtzl (1l973l ) and iPapanicolaou 
(1l976l ). Starting with lMajda et al.l ( 119991 ) the se mathematically rig orou s ideas have recently 



been applied in the atmospheric context by IFranzke et al.l (120051 ) and IFranzke and Majda 
(l2006h . 



We will introduce a simple deterministic multiscale toy model in which a slow degree 
of freedom with mult ip le stable states, r e sembling slow weather or climate regimes (eg. 
Legras and Ghil ( 1985 ): Crommelin ( 2003 ): Crommelin et"all ( 2004 ): Branstator and Berner 
(2005) and Ditlevsen ( 1999 ): Kwasniok and Lohmann ( 20091 )). is driven by a fast chaotic sub- 
system which also involves metastable regimes. Although simple, this model involves core 
phenomena found in realistic applications such as slow and fast metastable states as carica- 
tures of climate and weather regimes respectively, and rare transitions betwee n them. The 



model is ar nenable to the theory of hom ogenization (for an excellent review see iGivon et al. 



(l2004h and IPavhotis and StuartI (120081 )). and we will derive a reduced stochastic climate 
model describing the slow dynamics of the full higher-dimensional model. Whereas for non- 
systematic stochastic parametrizations the results are sensitive to details of the assumed 
processes, we will verify here that the homogenized reduced model faithfully represents the 
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slow dynamics of the full system even for moderate time scale separation. We have chosen a 
system amenable to homogenization precisely for the reason that the issue of the appropriate 
choice of the stochastic parametrization does not arise and the validity of the parametriza- 
tion is guaranteed by rigorous theorems (at least in the limit of large time scale separation). 



Our main focus here is how well homogenized stochastic climate models perform when used 
as forecast models in the context of ensemble data assimilation. In data assimilation one 
attempts to find an optimal estimate of the state of a system combinin g inforniation o f noisy 
observations and a numerical forecast model exhibiting model error (jKalnayl . 120021 ). This 
procedure is complicated by the nonlinear chaotic dynamics of the system as well as the 
impossibility of observing the entire system at any one time. On larger time scales, it is 
often only possible to adequately observe the slow, large scale degrees of freedom of the 
system, while the fast, small scale degrees of freedom in general remain unobservable. We 
consider one of the state-of-the art data assimilation methods called the ensemble Kalman 



filter (EnKF) (lEvensenl . Il994l . |2006| ). In such a filter one evolves an ensemble of state esti- 
mates forward in time using a full nonlinear forecast model, and then estimates the forecast 
(or background) mean and its associated error variance from the ensemble. Together with 
observations of the system, this covariance is used in a least squares minimization problem 
to find an optimal state estimate called the analysis, along with an estimate of the associated 
analysis error covariance. This filter is optimal for linear systems with Gaussian errors, as- 
sumptions which are generically not consistent with real world systems. Besides their ease of 
implementation, the attractive feature of ensemble filters is that the forecast error covariance 
is estimated using the full nonlinear forecast model. 



Ensemble based Kalman filters, however, suffer from the problem of sampling errors due 
to the insufficient ensemble size. These errors usually underestimate the error covariances 
which may ultimately lead to filter divergence, when the filter trusts the forecast and ignores 
the information given by the observations. To avoid filter divergence the concept of covari- 
ance infiation w as introduced whereby the prio r forecast error covariance is increased by an 
infiation factor ( lAnderson and Anderson! . Il999l ) . This is usually done globally and involves 
careful and computationally expensive tuning of the infiation factor (for re cent meth o ds on 
adap ti ve estimation o f the infiation factor from the innovation statistics see lAndersonI (120071 . 
2009h : lLi et aP tOO^ ). 



We will address here the following questions: Under what circumstances can reduced stochas- 
tic climate models improve the skill of an ensemble based data assimilation scheme if used 
as forecast models? Furthermore, if they do, why so? 



Harlim and Majdal ( 120081 . 120101 ) studied ensemble filtering on the Lorenz-96 model (ILorenzl . 
19961 ) in the fully turbulent regime, where they found skill improvement for a stochastic 
climate forecast model which was constructed by radically replacing all nonlinear terms by 
linear stochastic processes whose statistics are estimated by fitting the model to the climato- 
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logical variance, the decorrelation time and the observation time correlation. Here we study 
the performance of systematically derived climate models, however our results will also shed 
light on skill improvements seen in other ensemble based stochastic filtering methods. 



Stochastic climate models have often been found to not reproduce the autocorrelation func- 
tion of the f ull determinist i c syst e m well, in particular for sm all time scale separation (see 
for example iFranzke et al.l (|2005[ ): iFranzke and Majdal (120061 )). Here we will see that this 
inaccuracy in faithfully reproducing the statistics of the slow dynamics does not preclude 
stochastic reduced models from being beneficial in data assimilation. We will identify a range 
of observation intervals in which the reduced stochastic climate model actually outperforms 
the full deterministic model. The range of observation intervals for which skill improvement 
is observed will be related to the characteristic time scales of the system. In particular we 
find that the observation interval has to be larger than the typical time taken for the slow 
state to switch regimes, and smaller than the decay rate of the autocorrelation function of 
the slow variable. Hence, stochastic climate models will be beneficial when studying weather 
or climate systems on time scales which resolve regime switches. 



We find that although climate models with appropriately determined drift and diffusion 
terms faithfully reproduce the slow dynamics of a multiscale deterministic model, the statis- 
tics of the climate model is rather sensitive to uncertainties in the drift and diffusion terms. 
A central result will be that their superior performance in data assimilation is not linked 
to their accurate approximation of the statistical slow behaviour but rather to a controlled 
increase of the ensemble spread associated with their inherent dynamic stochasticity, with an 
effect similar to inflating covariances or increasing ensemble size. Hence optimal performance 
of stochastic reduced climate models will be achieved when the diffusion coefficient is larger 
than the value which produces the closest match to the full deterministic dynamics. 

The remainder of the paper is organised as follows: In Section [2] we introduce the model 
and derive the reduced stochastic climate model for the slow dynamics using homogenization 
techniques. Estimation of the drift and diffusion terms in the climate model is performed 
in Section [31 In Section H] we examine some of the relevant characteristic time scales of the 
dynamics. In Section |5] we then study the sensitivity of these time scales to uncertainties 
in the time scale separation parameter as well as in the diffusion and the drift term of the 
climate model. After a brief overview of ensemble Kalman filtering in Section E] we present 
numerical results showing the skill improvement in using the reduced climate model over 
the full deterministic model in an ensemble Kalman filter in Section [3 We conclude with a 
discussion in Section ID 
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2 The model 



W e consider th e follow ing: deterministic system with slow-fast time scale separation proposed 



m 



Givon et al.l (12004 



dx 
It 



dyi 10 , 

dy2 ^ 
dt 

dys 1 8 



:(28?/i - ?/2 - 2/12/3) 



(1) 
(2) 
(3) 
(4) 



Here the slow variable x is driven by a fast chaotic Lorenz system ( iLorenzl . Il963l ) . The slow 
dynamics ([I]) describes an overdamped degree of freedom in a one-dimensional potential well 
V{x) = — which is being continually "kicked" by the fast chaotic motion of the 
Lorenz subsystem ([2])-(j4]). In the following we will set = 0.01 unless specified otherwise. 

In Figured] we show a trajectory of the slow variable x obtained from a simulation of the full 
deterministic system The trajectory of the slow variable appears to randomly switch 

between two metastable states around the minima of the potential V{x) at x* = ±1. In the 
following Section we will derive a reduced stochastic differential equation which describes 
the effective slow dynamics. 




900 1000 



Figure 1: Sample trajectory of the slow variable x of the full deterministic 4D model ([I])-(!4j). 
The dynamics appears to randomly switch between two metastable states with randomly 
distributed sojourn times r^. 
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Derivation of the stochastic chmate model 



The ergodicity of the fast chaotic Lorenz equation ( iTuckerl . Il999l ) and its mixing prop- 
erty (ILuzzatto et al.l . 120051 ) allows for an application of stochastic model reduction tech- 
niques, by which the fast chaotic degrees of freedom are parametrized by a one-dimensional 
stochastic process. This can be heuristically justified provided the fast processes decorre- 
late rapidly enough; then the slow variables experience the sum of uncorrelated fast dy- 
namics during one slow time unit. According to the (weak) Central Limit Theorem this 
converges to approximate Gaussian noise in the limit when the time scale separation be- 
comes infinite. We will formal i ze th i s and apply stochastic singular perturbation t heory 
(homogenization) ( Khasmi nskyl . llQGGt iKurtzl . Il973t IPapanicolaoul . 1 19761 : iGivon et al.l . 12004 : 
Pavliotis and Stuarti . i2008i ) to deduce the following reduced stochastic ID climate model for 
the slow x-dynamics 

with 1-dimensional Wiener process W, where a is given by the total integral of the autocor- 
relation function of the fast y2 variable with 



cr 
Y 



4 
90 



I lim — 



y2{s)y2{t + s)ds} dt 



(6) 



Rather than studying the system ([I])-(11D directly, in stochastic homogenization one con- 
siders the associated Fokker-Planck or its adjoint backward Kolmogorov equation. Whereas 
the original ordinary differential equation is nonlinear, the latter ones are linear partial dif- 
ferential equations and can be treated with standard perturbation techniques, expanding 
in the small parameter e and studying solvability conditions of the linear equations at the 
respective orders of e. The solvability conditions are given by Fredholm alternatives and can 
be evaluated using the ergodicty of the fast process. 



We will analyze the system in the framework of the backward Kolmogorov equa- 

tion for the conditional expectation value of some sufficiently smooth observable (f){x, y) 
defined as 

v{xo, yo, t) = E [0(x(t), ?/(t)) I x(0) = xq, i/(0) = yo] 

with y = {yi, y2, ys). Here the expectation value is taken with respect to the ergodic measure 
induced by the fast dynamics of the chaotic Lorenz system. We drop the 0-subscripts for 
ease of exposition from here on. We study the following Cauchy problem for t G [0, oo) 



-Q^v{x,y,t) 
v{x,y,^) 



Lv{x,y,t) 
<P{x,y) , 



(7) 



with the generator 



L — — Lq H — Li + L2 , 

s'^ e 
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where 



Lo = giy) ■ 
4 d 



(9) 
(10) 



and g{y) denotes the scaled vectorfield of the fast Lorenz system g{y) = (10(?/2 ~yi), (28yi — 
1/2 — yiVzji (l/iZ/2 — 81/3/3)). We remark that equally we could have used the framework of the 
Fokker-Planck eq uation . 

Pioneered by Kurt j ( 1973 ) and Papanicolaou ( 1976 ) a perturbation expansion can be 
made according to 



v{x, y, t) = vo + evi + e^V2 + 



A recent exposition of the theory of homogen ization and their applications is provided in 
(iGivon et al.l . l2004l : iPavliotis and Stuartl . |2008[ ) . Substituting the series (fTTl) into the back- 
ward Kolmogorov equation ([7]) we obtain at lowest order, 0(1/6^), 



Lofo = 



(12) 



The chaotic dy namics of the fast Lorenz system, associated with the generator Lq, is ergodic 
(ITuckerl . Il999l ). Hence the expectation value does not depend on initial conditions, and we 
obtain 

Vo = vo{x,t) 

as the only solution of (IT^ . 

Ergodicity implies the existence of a unique invariant density induced by the fast chaotic 
dynamics given by the unique solution of the associated Fokker Planck-equation 

LoP = , 

where Lq is the formal L2-adjoint of the generator Lq. We label this solution as Pooiv)- 
At the next order, 0(l/e), we obtain 



LqVi = -Livo . 



(13) 



To assure boundedness of Vi (and thereby of the asymptotic expansion ( fTTl) ) a solvability 
condition has to be satisfied prescribed by the Fredholm alternative. Equation ( |T3l) is solvable 
only provided the right-hand-side is in the space orthogonal to the (one-dimensional) null 
space of the adjoint Lq, i.e. if 

4 d 
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where we introduced the average of an observable h{x, y) over the fast ergodic density as 
(^)poc. •= / h{^:y)Poo{y) J- Note that the vanishing of the average of the fast perturbation 
in the slow equation with respect to the invariant measure induced by the Lorenz system 
implies that classical averaging would only produce trivial reduced dynamics x = 0{e). 
Since {y2)p^ = 0, there exists a solution of f|T3l) . which we can formally write as 

vi{x, y, t) = -Lq ^Lit;o + R{x) , (14) 

where R{x) lies in the kernel of Lq. 
At the next order, 0(1), we obtain 

d 

^oV2 = -^vo - LiVi - L2V0 , (15) 

which yields the desired evolution equation for vq as the associated solvability condition 

d_ 

di 

The reduced slow backward Kolmogorov equation is then 

92 



Vq = {L2Vo)poo + {^iMx,t))pao ■ (16) 



|^o = x(l-x^)^.o-(^) (1/2L0W, 



/Poo 

This can be simplified further. For every function h with {h)p^ = we define 



vo . (17) 



H{x,y) = - I e^'^'hdt 



00 



which , upon using that Lq corresponds to an ergodic process, implies that LoH = h (jPavliotis and Stuart 



20081 ) ■ Hence we can evaluate 



(2/2L0 'y2)p.o = - / {y2e^'%)poo dt 



POO -j^ pT 

/ { lim - / y2is)y2{t + s)ds} dt 
Jo T Jo 



where the last identity follows from employing Birkhoff's ergodic theorem. The reduced 
backward Kolmogorov equation (fT7|) can then be written as 

d d a"^ d'^ 

-.o = x(l-x^)-.o + -^.o, (18) 

with 0"^ given by iQ. Note that R{x) does not contribute to the dynamics. We can therefore 
choose R{x) = in order to assure that {v)p^ = {vq)p^ + C(e^). 

The slow reduced Langevin equation associated with the reduced backward Kolmogorov 
equation ( !T8|) is then given by our stochastic ID climate model (|5]). 
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The unique invariant density of the gradient system (|5]) is readily determined as the unique 
stationary solution p{x) of the associated Fokker-Planck equation 

d d fdV \ (9^ 

We find 

p{x) = le-^^(^) (20) 

with 

/■oo 



Note that the unique invariant density of the full deterministic 4D system is now 

approximated by p{x,y) = p{x)pac{y) + 0{e). 



3 Parameter estimation 



The value of the diffusion coefficient a given by ([6]) cannot be determined analytically for 
the reduced stochastic ID climate model as the unique invariant density Pooiy) of the fast 
Lorenz subsystem ©-(jl]) is not explicitly known. We therefore need to evaluate the diffusion 
coefficient a numerically from simulations of the full deterministic 4D model under the 

assumption that its slow dynamics is well approximated by the stochastic ID climate model 
We d o so by coarse-g r aining the full deter r ninistic 4D sys t em an d estimating conditional 



averages (iGardineii . l2003l : ISiegert et all Il998l : IStemler et al.l . 120071 ). This method has been 



used in the meteorological community to study time series in diverse contexts such as synoptic 



li, 


2002: 


Sura. 


2003; 



20021 : iBerned . l2005l : iDitlevsenl . Il999h . This approach also allows us to determine the drift 
term of the stochastic ID climate model, which we write for convenience here as 



dx = d{x)dt + adW . 



(21) 



with drift term d{x) = x{l — a;^). Numerically integrating the full deterministic 4D system 
([T])-(j4]) with a small time step dt, we create a long time series x{ti), i = 0,1, ■ ■ ■ ,n with 
ti = i dt. To estimate the parameters of the Langevin system fl2Tl) we subsample the time 
series x(tj) at a coarse sampling time h ^ dt. We choose h such that during h the finely 
sampled trajectory x{ti) performs roughly 3-4 fast (smooth) undulations induced by the fast 
chaotic dynamics. 

We estimate the diffusion coefficient a (and the drift term d{x)) by partitioning the state 
space into bins Xj of bin size AX. We coarse grain by mapping x into X for x G (X, X + 
AX). To estimate the diffusion coefficient we define the average conditioned on the bins 
[X,X + AX] 

5(X) = i((a;"+^ -x")2) ^ , (22) 



h 



x"e{x,x+AX) 
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where x" = x{nh). The angular brackets denote a conditional ensemble average over mi- 
croscopic realizations x{ti) which fall into a coarse macroscopic bin [X, X + AX]. In the 
limit when the deterministic 4D dynamics can be approximated by the stochastic ID climate 
model fl2T|) and for small values of the sampling time we can evaluate 

S{X) = a^ + X\l-X^fh, 

where we used {dW) = 0. In the limit /i — > we estimate S{X) = cr^. Similarly the drift 
d{x) can be estimated by 

D(X) = i((x"+i-x")) , (23) 

h x"eix,x+AX) 

with D{X) ^ d{x) in the limit /i — )■ 0. 

There are several pitfalls one may encounter using this method to estimate the effective 
drift and diffusion coefficients of chaotic deterministic differential equations related to the 
choice of the sampling time h. (Sampling issues f or purely stochastic multiscale systems 
were already reported in iPavliotis and Stuart is chosen smaller than the typical 



time scales on which the fast subsystem decorrelates, the diffusion coefficient does not exist 
and >S'(X) will be close to zero in the limit h ^ 0. On the contrary, if h is chosen too 
large the estimator for the diffusion coefficient will be swamped by the deterministic part 
of the dynamics and we will obtain S'(X) ^ X^(l — X^)^/i. Similarly there are problems 
with estimating the drift -D(X) for sampling times h taken too large. For such subsampling 
times h, the coarse-grained dynamics will randomly switch from bin to bin according to the 
invariant density p{x) given by (!20|) . Since in our case p{x) is symmetric, we can write the 
continuous approximation of the estimator for the drift as 

D(X) = i J{Y-X)p{Y)dY = -^. 

This may lead to erroneous classifications of stochastic processes as Ornstein-Uhlenbeck 
processes. We remark that if the time series is not sufficiently long, the estimated drift coef- 
ficient in our case of a bimodal probability density function will be erroneously estimated as 
D{X) = —(X — Xo)/h due to the presence of the two metastable states around Xq = ±1. 

This procedure can be readily extended to study multi-dimensional stochastic processes 
where the bins would be hypercubes; the interpretation of the results in higher dimensions 
and the deducing functional dependencies may however be difficult. We rem ark that one 



may also use Kalman filters to estimate the parameters (ICornick et al.l . |2009| ). or estimate 



the generator directly (jCrommelin and Vanden-Eijndenl . |2006| ) 



In Figure [2] we show results from the parameter estimation procedure. We choose a coarse- 
grain bin size of AX = 0.05 and sample the slow variable x at constant sampling intervals 
h = 0.005 corresponding to 4 fast oscillations. We used a trajectory of the full deterministic 
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4D system with = 0.0005 and evolved until T = 3 x 10^ time units. We confirm 

the ID climate model by estimating the drift coefficient clearly as 



D{X) = X{l-X^ 



and estimating a constant diffusion coefficient S{X) = cx^ = 0.113. We note tha t this is 
slightly smaller than the value = 0.126 which was found by iGivon et al.l ( 12004J ) using a 
different method. 

In Figure [2t^c) we show how the diffusion coefficient S{X) depends on the sampling time h 
as discussed above. It is clearly seen that the diffusion coefficient a cannot be unambigously 
determined (at least not using this method). We will investigate in Section |5] how sensitive 
the statistics of the ID climate model is to this degree of uncertainty in the estimation 
of the diffusion coefficient. 



4 Time scales 

An important parameter in data assimilation is the observation interval Atobs- We will see 
in Section [7] that skill improvement of the analysis is dependent on Atobs- We therefore 
now analyze several characteristic time scales of the deterministic 4D toy model (II])- (jl]). In 
particular we will focus on the slow variable x. 

4.1 Autocorrelation time 

We first estimate the decorrelation time Tcott, which is linked to the decay rate of the auto- 
correlation function of the slow variable 

C(r)= hm 1 r x{s)x{t + s) ds . (24) 

The autocorrelation time is estimated as the e-folding time of C(r). The autocorrelation 
function C(r) can be estimated from a long trajectory of x obtained from a simulation of the 
full deterministic 4D model (Il])-(l4]). For small values of r it decays approximately exponen- 
tially with a measured decay rate of 0.00481 corresponding to an e-folding time of Tcorr = 208 
time units. 

4.2 Sojourn and exit times 

As a second time scale we consider the mean sojourn time 
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Figure 2: Parameter estimation for the (a) drift term D{X), and (b) diffusion coefficient 
S{X) from a long trajectory of the full deterministic 4D system with = 0.0005 and 

sample interval h = 0.005. The dashed line depicts the theoretically expected drift from the 
stochastic ID climate model (j5]) in (a), and the constant diffusion coefficient o"^ = 0.113 in 
(b). (c): Diffusion coefficient S{X) as a function of the sampling time h. 
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where the sojourn times measure the times spent in one of the slow metastable states 
around x* = ±1 as depicted in Figure [H We expect this time scale to strongly correlate with 
the autocorrelation time scale Tcorr as it is random transitions between the slow regimes which 
cause decorrelation of the slow variable. We numerically estimate the mean sojourn time 
using two separate methods. Firstly, we estimate the mean sojourn time directly from a long 
trajectory of the slow variables x using fl25|) as f ^ 218.2 time units. Secondly, the succession 
of rapid transitions and relatively long residence times in the potential wells suggests that 
successive jumps between metastable states can be treated as independent random events, 
and that the sojourn times are a Poisson process with cumulative probability distribution 
function 

Pe(rO = l-exp(-^) . (26) 

In Figure [3] we show the empirical ranked histogram Pc{Ti) for the sojourn times measured 
from a long trajectory x{t) of the full deterministic 4D system ([I])-(j4]), allowing us to deter- 
mine the mean sojourn time f ~ 213.8 time units. Calculating the mean sojourn time by 
either the average of individual sojourn times ( l25i) or via the Poisson process approximation 
( l26l) yields results differing by only 2%, indicating that for e'^ = 0.01 the fast chaotic Lorenz 
subsystem has almost fully decorrelated and that the rare transitions between the metastable 
states are approximately a Poisson process. In the following we will use the average of the 
two obtained values of the mean sojourn time and set r = 216.0. 




Figure 3: Log-plot of the normalised histogram of sojourn times Tj of the slow variable x 
in each metastable state around x* = ±1 for the full deterministic 4D system ([l])-(j4]). The 
dashed line shows a least-square fit. Using ( l26i) the mean sojourn time can be estimated via 
the inverse of the slope of the least-square fit as r 213.8. 
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These numerically obtained results for the full deterministic 4D system compare well 

with th e value ca l culate d analytically for the stochastic ID climate model (E]) using Kramer's 
theory (IKramersl . 1 19401 ) to calculate the first exit time 



-T . 



We define the first exit time to be the average time it takes from the potential well at 
X = ±1 to reach the saddle of the potential V{x) at x = 0. To calculate the exit time we 



solve the following Cauchy problem (see for example IZwanzig 



Lclim^e 



(27) 



where 



Lclim ^ dx ~\~ 2 ^xx 

is the generator of the stochastic ID climate model with the boundary conditions 
dxTe{x = ±1) = and Te(0) = 0. Note that the first exit times for the symmetric po- 
tential V out of the potential wells atx = — lora; = l are identical. Upon using the 
boundary conditions, we solve (|27|) to obtain 

with /3 = 2/o"^, which can be numerically evaluated to yield t^. = 117.8 for cr^ = 0.113. The 
exact analytical value calculated from the reduced stochastic ID climate model compares 
well with the numerically estimated values of the first exit times of the full deterministic 4D 
model Te = 108.0, confirming the accuracy of the homogenized model to faithfully reproduce 
the statistics of the full parent model. 



4.3 Transit time 

As a third characteristic time scale, we study the mean transit time which is the average 
over the shortest transit time r^^j between the two slow metastable states around x* = ±1. 
We define this time to be the mean of 

Tt,i = mm{{ti - tj)\xiti) = ±1 A x{t,) = ^1} . (28) 

For a long trajectory containing 5000 transitions between the slow metastable states we 
numerically estimate Tt = 5.90 for the full deterministic 4D model ([l])-(jl]). We found that 
the transit time does not vary greatly with the value of e. 
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As for the mean sojourn time r, the transit time Tt can be a nalytically calculated for the 
stochastic ID climate model (see for example iGardinei] fl2003h ) as 



n+i(o)-n^i(o) 

^¥-i(0) 



dx'p {x') I p_i{z)p{z)dz 



where 

n+i(x) = p+i(x) 

n_i(x) = P-i{x) I dx' p^^{x') I p^i{z)p{z)dz 
with the splitting probabilities 



!ldzp-\ 
llidzp~H 
A dzp-\ 
llidzp-^{ 



(29) 



For the stochastic ID climate model with = 0.113 we evaluate this numerically to be 
Tf = 5.66 which again compares well with the numerically obtained value of the transit time 
for the full deterministic 4D system. 



5 Validity of the stochastic climate model 



In this Section we numerically investigate to what degree the stochastic ID climate model ([5]) 
faithfully reproduces the slow dynamics of the full deterministic 4D system ([I])-(jlD. It is clear 
that such a correspondence can only be of a statistical nature. The close correspondence of 
the time scales of the full deterministic 4D system and of the stochastic ID climate model for 
cr^ = 0.113 and = 0.01 already indicates the validity of the stochastic ID cli naate naodel in 
repr oducing statis t ical a spects of the slow dynamics. The rigorous theory by iKurta (Il973l ) 
and iPapanicolaod (119761 ) assures weak convergence of the stochastic climate model in the 
limit of £ — )■ 0. Here we discuss the effect of finite time scale separation e. Furthermore we 
study the sensitivity of the climate model to uncertainties in the diffusion coefficient as 
we have encountered in Section [31 In the Appendix we will also investigate the sensitivity 
to possible uncertainties in the drift term. 



5.1 Probability density function 

We first examine how the probability density function varies for different values of the time 
scale separation parameter e. In Figure H] we show empirical densities for the slow variable x 
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obtained from a long simulation of the full deterministic 4D system (II]) -dl]) for different values 
of the time scale separation e. The obtained estimates of the empirical invariant densities 
exhibit a bimodal structure reflecting the two slow metastable regimes near x* = ±1. The 
probability density functions are reasonably insensitive to changes in e^, with the maxima 
of the densities at x* = ±1 differing for = 0.0005 and e"^ = 0.01 by less than 1%, and 
by approximately 5% for e'^ = 0.01 and e"^ = 0.1. This numerically demonstrates the weak 
convergence of solutions of the homogenized stochastic ID climate model to solutions of the 
full deterministic slow dynamics. 



0.012 




Figure 4: Normalised empirical densities for the slow variable x of the full deterministic 4D 
model (HD-dH) with = o.l (dashed line), = 0.01 (solid line) and e"^ = 0.0005 (circles). 
The empirical densities were obtained from long trajectories integrated to time T = 10^ time 
units. 

Next we show in Figure [5] how sensitive the empirical density of the stochastic ID climate 
model is to changes in the diffusion coefficient cr^. Unlike for changes in the time scale 
parameter e, the ID climate model is quite sensitive to changes in the diffusion coefficient. 
The invariant density of the full deterministic 4D model is best estimated by the stochastic 
ID climate model with cr^ = 0.113. 

5.2 Time scales 

Differences in the probability density function imply different statistics of the dynamics. In 
particular, we will now examine how the characteristic time scales of the full deterministic 
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Figure 5: Invariant density fl^U]) for the stochastic ID chmate model with cr^ = 0.1 
(circles), = 0.113 (crosses), = 0.126 (squares) and cr^ = 0.15 (triangles), and empirical 
density for the slow variable x of the full deterministic 4D model ([I])-(l4]) with = 0.01 (solid 
line). 

4D model change with the time scale separation e, and how they change for the 

stochastic ID climate model ([5]) with varying diffusion coefficients a^. 

Consistent with our earlier observation that the probability density function does not vary 
greatly with the time scale separation parameter e (provided it is sufficiently small), we con- 
firm here that the time scales are insensitive to a decrease in e for < 0.01. In particular 
the transit time exhibits very little variation with e. 

We find that with the exception of the transit time Tt, the characteristic time scales of Section 
IHare very sensitive to uncertainties in the diffusion coefficient cx^. For example, in Figure [6] 
the autocorrelation function C(r) is shown as estimated from a long trajectory of x obtained 
from a simulation of the full deterministic 4D model and from simulations of the 

stochastic ID climate model (|5]) with different values of the diffusion coefficient o"^. Using 
= 0.113 in ([5]) produces the best fit to the shape of C(r) for the full deterministic 4D 
model (P-dH). 

Table [1] lists the values of the characteristic time scales for the stochastic ID climate model 
(jS]) for various values of the diffusion coefficient cr^. We show the percentage difference of 
each value with that for the full deterministic 4D model with = 0.01 in brackets. 

All time scales vary by approximately a factor of 5 over the range of values indicated except 
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Figure 6: Autocorrelation function C(r) as a function of the time lag r for the full determin- 
istic 4D model ([I])-(l4]) (solid line) and for the stochastic ID climate model ([5]) with cr^ = 0.1 
(circles), = 0.113 (crosses), = 0.126 (squares) and = 0.15 (triangles). 

for the transit time which varies by just 13% and approximates the transit time of the full 
deterministic model better as decreases. The transit time is determined by the rate of 
divergence associated with the unstable fixpoint at a; = of the deterministic drift term, 
making it relatively insensitive to changes in the diffusion coefficient. The transit times of 
the full deterministic model however are slightly overestimated, since the definition ( l28l) does 
not take into account the skewness of the probability density function (see Figure HI) which 
implies that the trajectory spends more time within the range x G [—1, 1] than outside 
this range. The table indicates that all time scales (except the transit time Tj) are best 
approximated by the stochastic ID climate model with cr^ = 0.113, consistent with the 
better approximation of the probability density function for that value of cr^. However, we 
will see in Section [7] that in the context of ensemble data assimilation it is not necessarily 
ideal to use this value for a^; in fact it is preferable to use models with larger diffusion to 
create a more reliable ensemble with superior analysis skill. 

6 Ensemble Kalman filtering 

We briefiy introduce how data assimilation is performed in an ensemble filter framework. 
Given an iV- dimensional dynamical system 

it = F(z,) , (30) 
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Table 1: Characteristic time scales of the stochastic ID climate model ([5]) for different values 
of the diffusion coefficient o"^. In brackets we indicate the error when compared to values 
obtained from the full deterministic 4D model with = 0.01. 





= 0.1 


ct2 = 0.113 


(T^ = 0.126 


= 0.15 


^corr 

n 


353.9 (70.1%) 
205.7 (90.5%) 
5.86 (0.07%) 


221.7 (6.6%) 

117.8 (9.1%) 
5.66 (4.2%) 


129.0 (61.2%) 
75.6 (42.6%) 
5.48 (7.7%) 


70.5 (195%) 
40.8 (164.7%) 
5.17 (14.1%) 



which is observed at discrete times ti = iAtohs, data assimilation aims at producing the best 
estimate of the current state given a typic ally chaotic, po ssibly inaccurate model z = /(z) 
and noisy observations of the true state z^ (iKalnayl . 120021 ). 

Observations yo G M" are expressed as a perturbed truth according to 

yo(ti) = UztiU) + ro, 

where the observation operator H : — )■ M" maps from the whole space into observation 
space, and Fq € M" is assumed to be i.i.d. observational Gaussian noise with associated error 
covariance matrix Rq. 

In an ensemble Kalman filter (EnKF) (lEvensenl . Il994j . |2006| ) an ensemble with k members 

Zfc 

Z = [Z1,Z2,...,Z,.] GM^^'= 

is propagated by the dynamics (1501) according to 

Z = f (Z) , f (Z) = [/(zi), /(Z2), . . . , /(z,)] G M^x'^ . 



(31) 



This forecast ensemble is split into its mean Zf and ensemble deviation matrix Z'j. The 
ensemble deviation matrix Zj can be used to approximate the ensemble forecast error co- 
variance matrix via 



1 „ ^,,..,T 



-z'it) [z'{t)Y e 



i)NxN 



(32) 



Note that P/(t) is rank-deficient for k < N which is the typical situation in numerical 
weather prediction where is of the order of 10^ and k of the order of 100. 

Given the forecast mean z/, the forecast error covariance P/ and an observation an 
analysis mean is produced by minimizing the cost function 



(yo-Hz)^R-^(yo-Hz) 



which penalizes distance from both the forecast mean and the observations with weights 
given by the inverse of the forecast error covariance and the observational noise covariance. 
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respectively. The analysis mean is readily calculated as the critical point of this cost function 
with 

Za = z/ + Ko [yo - Hz/] (33) 



where 



Ko = P/H^(HP^H^ + Ro) ' 
Pa = (I-KoH)P;. 



(34) 



Using the Shermann- Morrison- Woodbury formula (iGolub and Load . Il996l ) we may rewrite 
the Kalman gain matrix and the analysis covariance matrix in the more familiar - though 
computationally more complex - form as Kq = PaH'^R"^ and 



P, = (P7I + H^R-^H) 



-1 



(35) 



To determine an ensemble which is consistent with the analysis error covariance P^, and 
satisfies 



1 



k 



—K K] 



where the prime denotes th e devia t ions f rom the analysis mean, we use the method of 
ensemble square root filters (ISimonl. 120061) . In particular we use the method proposed in 
(iTippett et al.l . l2003t IWang et al.l . 120041 ). the so called ensemble transform Kalman filter 
(ETKF), which seeks a transformation S e M'^^^ such that the analysis deviation ensemble 
Z^ is given as a deterministic perturbation of the forecast ensemble Zj via 



Z' 



Z',S 



(36) 



Alternatively one could choose the ensemble adjustment filter (lAndersonl . l200l[ ) in which the 
ensemble deviation matrix Z'j- is pre-multiplied with an appropriately determined matrix A G 
j^AfxAf j^qIq that the matrix S is not uniquely d etermined for k < N. The transformation 
matrix S can be obtained by ( IWang et al.l . l2004j ) 



S = C(l, + f) ^C^ 
Here CFC^ is the singular value decomposition of 



U 



z'/h^r-^hz', . 



k-1 



f 



The matrix C G M'^'^('^~^) is obtained by erasing the last zero column from C G M'^'^*^, and 
r G mC^-^)^^^-^) is the upper left {k — 1) x {k — 1) block of the diagonal matrix F G R''^''. 
The deletion of the eigenvalue and the associated columns in C assure that Z'^ = Z^S 
and therefore that the analysis mean i s given by z„. Th is method assures that the mean 
is preserved under the transformation (jWang et al.l . |200J) which is not necessarily true for 
general square root filters. We note that the continuous Kalman-Bucy filter can be used to 
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calculate without using an y computations of matri x inverses, which may be advantageous 
in high- dimensional systems ( iBergemann et al.l . 120091 ) . 



A new forecast is then obtained by propagating with the nonhnear forecast model to the 
next time of observation, where a new analysis cycle will be started. 

In the next Section we perform data assimilation for the toy model introduced in Section |5J 
We examine when and why the stochastic ID climate model (E]) can be used as a forecast 
model to improve the analysis skill. In particular, we will relate the range of validity of 
the ID climate model to the relation between the observation interval and the characteristic 
time scales introduced in Section HI 



7 Stochastic climate model as forecast model in data 
assimilation 

We investigate how the ETKF as described in the previous Section performs when either the 
full 4D deterministic system or the reduced stochastic ID climate model (|5]) is used 

as a forecast model, when the truth evolves according to the full deterministic 4D model. 
We are concerned here with the following questions: Can the dimension reduced stochastic 
climate model produce a better analysis than the full deterministic model? And if so, under 
what circumstances and why? 

We generate a truth zt = {xt, yit, y2t, Vst) by integrating ([I])-© using a fourth order Runge- 
Kutta scheme with timestep dt = e^/20. Observations of the slow variables only are then 
generated at equidistant times separated by the observation interval Atobs according to 
Xohs = Xt + r] where 7] ~ M{0, VRohs) with Rohs = 0.5 a'^ and = 0.126. Note that this 
value is slightly larger than the optimal value cr^ = 0.113 which was found in the previous 
section to best approximate the full deterministic 4D model. 

To integrate the forecast model we use a fourth order Runge-Kutta scheme with timestep 
dt = e^/20 for the full deterministic 4D model with e"^ = 0.01, and an Euler-Maruyama 
scheme for the reduced stochastic ID climate model using the same timestep. Unless stated 
otherwise we use = 0.126 for the stochastic ID climate model. We include a spin- up 
time of 100 analysis cycles. We perform twin experiments, where both forecast models are 
started from the same initial conditions, and are then subsequently assimilated using the 
same truth and observations over 5000 time units. In order to avoid rare occasions of filter 
divergenc e due to an underestimation of the forecast covariances we employ a 2% variance 



inflation (lAnderson and Anderson! . 119991 ) . 



Figure [7] shows two sample analysis time series created by assimilating the same truth and 
observations at regularly spaced intervals of Atobs = 50 where the stochastic ID climate 
model and the full deterministic 4D model are used as the respective forecast models. Both 
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forecast models track the truth welL The stochastic ID chmate model however tracks the 
transitions between the slow metastable states around x* = ±1 more accurately. 




Figure 7: Sample truth (solid line) and ETKF analyses (dashed line) for the observed x- 
component using the full deterministic 4D model (II])- (H]) (top panel) and the reduced stochas- 
tic ID climate model ([5]) (bottom panel) as the forecast model. The crosses are observations 
with error variance i?obs = O.Sa^ (cr^ = 0.126). Here we have used an observation interval 
Atobs = 50 and used A; = 15 ensemble members. 



We quantify the improvement made using the reduced stochastic ID climate model by mea- 
suring the RMS error £ between the observed truth xt and the analysis mean x^, 



i=l 

where N is the total number of analysis cycles. In Figure |H] we show £ as averaged over 
1000 realizations, where we have used both the full deterministic 4D model and the reduced 
stochastic ID climate model as forecast models, as a function of the observation interval 
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Aiobs- Both filters exhibit the same accuracy for small observation intervals Atobs < 10, 
when the system is frequently observed, and for very large observational times Atobs > 500, 
when the forecast is much larger than all the characteristic time scales we discussed in Sec- 
tion |H At very large observation intervals (not shown), when the ensemble of both forecast 
models will have explored the state space of the slow variable with climatological variance 
'^ciim ~ 1' resulting analysis error covariance can be estimated using the Kirchoff-type 
addition rule of covariances as = a^;^ + R~^^ as Pa = 0.244^ for i?obs = O.Sa^ and 
0"^ = 0.126, which corresponds well with the measured asymptote of the RMS error S in 
Figure [HI Interestingly, for moderate observation intervals Atobs the stochastic ID climate 
model performs considerably better than the full deterministic 4D model. Moreover, the 
analysis using the full deterministic 4D forecast model exhibits RMS errors worse than the 
observational error of y/Rohs = 0.251. In contrast, when the stochastic ID climate model is 
used as a forecast model for the filter, the analysis error is always below the observational 
error, i.e. performing data assimilation is desirable over just trusting the observations (albeit 
with an improvement in RMS error of only around 5%). 




Figure 8: RMS errors S of the analysis using the full deterministic 4D model (II])- dl]) (crosses) 
and the stochastic ID climate model ([5]) with cr^ = 0.126 (circles) as forecast models, as a 
function of the observation interval Atobs- The horizontal dashed line indicates the obser- 
vational error, y/R~^^ = a/V2 = 0.251 {a^ = 0.126). The results are averaged over 1000 
realisations. 

We introduce the proportional improvement in the RMS error S or skill 

JO _ ^^fuU 
~ F ' 

'^clim 
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where £^fuii and £^ciim are the RMS errors when the full deterministic or the reduced stochastic 
model is used as forecast model, respectively. Values of iS > 1 imply that it is beneficial to 
use the reduced stochastic ID climate model (note that when iS = 1 it is still beneficial to use 
the stochastic ID climate model because of the reduced computational expense). Figure [9] 
shows 5 as a function of the observation interval Atobs? averaged again over 1000 realiza- 
tions, with the time scales r^, Tg and Tcorr superimposed. Skill improvements occur only for 
observation intervals larger than the mean transit time r^, with the maximum skill improve- 
ment over all forecasts occurring for Atobs slightly smaller than the mean exit time Tg. If 
forecasts are longer than Tcorr, the full and climate models are essentially indistinguishable 
with iS = 1. 

Figure |9] also shows how the skill S is distributed over the whole set of the analyses in- 
cluding metastable states and transitions between them, as well as only over those analyses 
which remain within a metastable state near either = — 1 or x* = 1, and those which fall 
in the transitions between the metastable states. Note that for large observation intervals 
Atobs > Tcorr half of all forecasts switch between regimes, so the notion of analyses which 
remain within one metastable regime becomes obsolete. As such, we do not plot skill as 
calculated over the metastable regimes for r > Tcorr in this case. 

Figure in] illustrates that the major contribution to the skill improvement is from the stochas- 
tic ID climate model better detecting the transitions between slow metastable states. While 
there is actually a minor degradation in skill of around 3% for forecasts with Tt < Atobs < Tcorr 
made for analyses which remain within metastable states, skill improvements of over 25% are 
made for those analyses which fall into the transitions between the metastable states. This is 
because analyses which detect transitions between metastable states incorrectly contribute 
large RMS errors compared to the small errors (0(i?obs)) made by incorrect analyses 

within the metastable states. These large errors have a major impact upon the total skill S. 
There is also skill improvement as calculated over the transition periods for small values of 
the observation interval Atobs- However, there is no corresponding overall skill improvement 
since for small observation intervals the number of instances where the truth remains within 
a metastable regime is overwhelmingly larger, thereby swamping the skill improvement ob- 
tained over the rare transitions. 

We note that the skill S is insensitive to the observational noise level, provided -Robs is 
sufficiently small so that observations do not misrepresent the actual metastable state or 
regime in which the truth resides. The skill S is roughly the same for i?obs = O.lcr^ and -Robs = 
0.25cr^, and we observe no skill improvement S ^ 1 when we take very poor observations 
with -Robs = 0.5, approximately half the climatological variance of the full slow system. 

7.1 Sensitivity of analysis skill to uncertainties in the diffusion 
coefficient 

As for the time scales in Section El we examine how the analysis skill S depends on the 
diffusion coefficient o"^ (see the Appendix for dependence on the drift term). Figure [TD] 
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Figure 9: Proportional skill improvement S of the stochastic ID climate model ([5]) over the 
full deterministic 4D model as a function of the observation interval Atobs- We show 

S as calculated over all analyses (circles), the metastable regimes only (crosses), and the 
transitions only (squares). Dashed vertical lines indicate the characteristic time scales of the 
slow dynamics: the decorrelation time Tcow, first exit time and transit time Tt. Parameters 
are as in Figure El 
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shows the skill iS as a function of the observation interval Atobs for several values of cr^. For 
very small diffusion and for very large diffusion (not shown) the skill is actually smaller than 
1 for all observation intervals, implying that the stochastic ID climate model performs worse 
than the full deterministic 4D model. There exists a range of values of o"^ for which the 
stochastic climate model outperforms the full deterministic model with roughly similar skill 
values 5(Atobs) > 1 for n < Atobs < ^-corr- 

The dependence of analysis skill upon the diffusion coefficient can be understood as follows: 
for small diffusion (a^ = 0.1) the stochastic ID climate model is less diffusive than the 
truth and the deterministic 4D forecast model, and its associated forecast is likely to remain 
in a metastable state. This produces small ensemble spread and therefore causes more 
instances of the forecast not detecting a transition between regimes. On the other extreme, 
for large diffusion cr^ ^ 0.15, the stochastic ID climate model exhibits regime switches too 
frequently, thereby producing a large forecast error which contaminates the analysis. For 
moderate values of the diffusion coefficient, the increase in spread of the forecast ensemble 
helps the detection of regime switches. 



1.15 
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Figure 10: Proportional skill improvement S of the stochastic ID climate model ([5]) over 
the full deterministic 4D model (II])- (H]) as a function of the observation interval Atobs with 
0"^ = 0.1 (circles), cr^ = 0.113 (crosses), cr^ = 0.126 (squares) and cr^ = 0.15 (triangles). All 
other parameters are as in Figure [H] 
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7.2 Effect of ensemble size and covariance inflation 

The results from the previous subsection suggest that the observed skill improvement is 
linked to the increased forecast ensemble spread of the stochastic climate model, rather than 
to the accuracy of the stochastic parametrization. This is exemplified by the fact that for 
(T^ = 0.113 for which the stochastic ID climate model approximates the statistics of the full 
deterministic 4D model best, there is no skill improvement 5 ~ 1, and skill improvement 
5 > 1 is given for o"^ > 0.113 where transitions between the metastable states are better 
captured. 



The lack of sufficient ensemble spread when using the full determinist ic system as a forecast 
model is caused by the finite size of the ensemble (lEhrendorferl . 120071 ). In the data assimila- 
tion community a common appr oach to account for insufficient spread is covariance inflation 
([Anderson and Andersonl . Il999l ). whereby the forecast variance P/ is artiflcially inflated by 
multiplication with a factor 6 > 1. We now investigate how the ensemble size k and inflation 
factor 6 affect the skill. In particular we show that the RMS error S using the full deter- 
ministic model can be reduced by either increasing the ensemble size k or by using larger 
inflation factors 6. 



Figure [TT] shows as a function of the ensemble size k for Atobs = 40 (which is close to the 
observation interval yielding maximal skill improvement S; see Figure [9]), o"^ = 0.126, and 
averaged over 200 realizations with = 0.126. For large ensembles with A; > 35 the overdif- 
fusive stochastic ID climate model and the full deterministic 4D model perform equally well 
as forecast models. We have checked that when using a large ensemble {k = 50) the stochas- 
tic climate and full deterministic models perform equally well for the range of observation 
intervals used in Figure [H For smaller ensemble sizes k the stochastic ID climate model 
outperforms the full deterministic 4D model. This illustrates that the skill improvement is 
linked to the insufficient ensemble spread of the deterministic 4D model which is compen- 
sated in the stochastic ID climate model by a slightly increased diffusion coefficient. 

Similarly, for fixed ensemble size k, the ensemble spread can be artificially increased by 
infiating the forecast covariance with a constant factor 6. Figure [12] shows £^ as a function 
of the infiation factor S, for 1 < 5 < 3. It is seen that for unrealistically large values of 6 
the RMS error decreases when the full deterministic 4D model is used as a forecast model, 
whereas it is relatively insensitive to changes in 6 for the stochastic ID climate model. For 
even larger values of 6 the RMS errors for the two forecast models eventually become equal 
(not shown). 



7.3 Reliability and ranked probability diagrams 

Another way of understanding why the ensemble variance of the stochastic ID climate model 
produces skill better than that of the full deterministi c 4D mode l is through ranked proba- 
bility histograms (also known as Talagrand diagrams) (lAndersoru . Il996l : iHamill and Coluccil . 
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Figure 11: RMS error S as a. function of ensemble size k, using the full deterministic 4D 
model (crosses) and the stochastic ID climate model with cr^ = 0.126 (circles) as 

forecast models. We used Atobs = 40; all other parameters as in Figure [HI 




Figure 12: RMS error £^ as a function of inflation factor 6, using the full deterministic 4D 
model ([I])-(|1D (crosses) and the stochastic ID climate model ([5]) with cr^ = 0.126 (circles) as 
forecast models. We used Atobs = 40; all other parameters are as in Figure [HI 
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19971 : iTalagrand et al.l . 119971 ) . To create probability density histograms we sort the forecast 
ensemble = [xf^i,Xf^2, ■■■,Xf^k] and create bins (— oo,x/^i], (x/_i,x/^2], ••• , (a;/,fc,C)o) at 
each forecast step. We then increment whichever bin the truth falls into at each forecast 
step to produce a histogram of probabilities Pi of the truth being in bin i. A reliable en- 
semble is considered to be one where the truth and the ensemble members can be viewed 
as drawn from the same distribution. A flat probability density histogram therefore is seen 
as indicating a reliable ensemble for which each ensemble member has equal probability of 
being nearest to the truth. A convex probability density histogram indicates a lack of spread 
of the ensemble, while a conca ve diagram i ndicates an ensemble which exhibits too large 
spread. We direct the reader to IWilkd (120061 ) for a detailed discussion. 



Hamilll (l200ll ) suggested that a single ranked probability histogram is not sufficient for dy- 
namical systems with several dynamical regimes as in our case, but that in fact one must look 
at the variability of the ensemble in the different dynamical regions individually. We there- 
fore construct ranked probability histograms (for both forecast models) over the metastable 
regime with characteristic time f only and over the complementary regime of transitions 
with characteristic time only, as done for the distribution of the skill over those regimes 
(see Figure [2]). 



In Figure [131 plot probability density histograms of the forecast for the full deterministic 
4D model ([I])-(11D and the stochastic ID climate model ([5]) with various values of the diffusion 
coefficient a^, (a) over all analyses, (b) over the analyses which fall near the metastable states 
X* = ±1 and (c) over the analyses which fall in the transitions between metastable states. 
We use a long analysis cycle containing 250, 000 forecasts. Although the full deterministic 
model produces the most reliable ensemble when averaged over all forecasts, it overestimates 
the variance in the metastable regions and underestimates the variance across the transitions. 
For the stochastic ID climate model, the more underdiffusive model with = 0.1 produces 
the most reliable ensemble in the metastable regions while the more diffusive models with 
(T^ = 0.126 and cx^ = 0.15 both produce more uniform ranked probability histograms than 
the full model does in the transition regions. The reduced stochastic ID climate model with 
optimal diffusion coefficient cr^ = 0.113 behaves similarly to the full deterministic 4D model 
as expected. It is thus more desirable to favour the more diffusive stochastic climate models 
because a comparatively large improvement in RMS error can be made in the transition 
regions compared to in the metastable regions. This agrees with our earlier observation that 
the skill improvement is entirely due to the stochastic climate model being more accurate 
in the transition regions where large errors can be accrued by misclassifying the regime (see 
Figure [9]) . 
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Figure 13: Ranked probability histograms for the full deterministic 4D model (dashed 
line) and the stochastic ID climate model ([5]) with = 0.1 (circles), = 0.113 (crosses), 
(T^ = 0.126 (squares) and cr^ = 0.15 (triangles) over all forecasts (a), as well as only those 
forecasts which fall near the metastable states x* = ±1 (b) and those which fall in the 
transitions between metastable states (c). 
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8 Discussion 



We have investigated a homogenized stochastic chmate model of a bistable multiscale system 
with a chaotic fast subsystem, and its usage as a forecast model in ensemble Kalman filter- 
ing. The reduced stochastic climate model was numerically shown to faithfully reproduce 
the statistics of the slow dynamics of the full deterministic model even in the case of finite 
time scale separation where the theorems underpinning homogenization fail. Homogenized 
stochastic climate models replace in a controlled fashion the accumulative effect of the fast 
chaotic dynamics on the slow dynamics by white noise. We estimated the analytical expres- 
sions of the drift and diffusion terms of the reduced stochastic model using coarse-graining 
and quadratic variations, and found that the diffusion coefficient is very sensitive to the ap- 
plied undersampling time. The statistics of the stochastic climate model, i.e. the probability 
density function and averaged characteristic time scales, were found to be very sensitive to 
the diffusion and drift terms used. Despite this sensitivity and the implied uncertainties asso- 
ciated with the "correct" stochastic climate model, we found that stochastic climate models 
can be beneficial as forecast models for data assimilation in an ensemble filter setting. We 
have shown numerically that using such a stochastic climate model has the advantage of 1.) 
being computationally more efficient to run due to the reduced dimensionality and avoidance 
of stiff dynamics, and 2.) producing a superior analysis mean for a range of observation in- 
tervals Atobs- These skill improvements occur for observation intervals larger than the time 
taken to switch between slow metastable states but less than the decorrelation time Tcon-- 

This skill improvement is due to the associated larger ensemble variance of the stochastic 
climate model, which allows the ensemble to explore the full state space of the slow variable 
better than the full deterministic model does. Forecasts made by the stochastic climate 
model will generally contain more ensemble members which cross the potential barrier be- 
tween the slow metastable states (weather or climate regimes) than full model forecasts will. 
Consequently, there is a greater chance during the forecast of one or more climate model 
ensemble members crossing the potential barrier to the opposite metastable state than in 
the full deterministic model ensemble. If an observation appears near the other metastable 
state from where the forecast began, the Kalman filter analysis equation (|33|) trusts the ob- 
servation more than the forecast, correctly producing an analysis which changes between the 
regimes. Thus the stochastic climate model with large enough diffusion better detects tran- 
sitions between slow metastable states and so has lower RMS error near these transitions in 
which large errors can be accrued. We remark that strict time scale separation is not neces- 
sary for the existence of metastable regimes; we have checked that for e = 1 similar dynamics 
as depicted in Figure [1] can be observed for increased coupling strength. In this case we can 
still obtain skill for observation intervals between the transit time and the decorrelation time 
when using a stochastic Langevin equation of the form albeit with decreased diffusion 
whose dynamically optimal value is no longer approximated by homogenization. 

Similarly, we remark that multimodal probability den sity functions are not necessary fo r 



the existence of metastable regimes, as was pointed out by lWirthI (120011 ) : lMajda et al.l ( 120061 ). 
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In cyclostationary systems, i.e. systems which involve time-periodic coefficients, or in aperi- 
odic systems in which "regimes" occur intermittently, the probability density function may 
be unimodal whereas if restricted to time windows focussing on the regimes the "hidden" 
regimes can b e identified. This has i mpor t ant im plications for interpreting atmospheric data 



(iMajda et al.l . l2006l : iFranzke et al.l . l2008l . |2009[ ) where one needs to reconcile the apparent 



paradox of multimodal probability dens ity functions of pl a netary waves obtained frona (rel- 



atively short-time ) observational data ( iKimoto and Ghill . Il993t ICheng and Wallace! . Il993 



Corti et al. . 1999[ ) and unimodal but non-Gaussian probability density functions obtained 



from long-time simulations of atmospheric general circulation models ( IKond rashoy et al. . 



2OO4J : iBranstator and Bernerl . l2005l : iFranzke and Majdal . l2006l : lBerner and B ranstatorl. l2007f) 



(for a c ritical account on t he an a lysis of short clim atic data see for example iHsu and Zwiers 
fl2nnih : Ist^henson et all tm4} : lA^baunJ toO^ ). 



However, it is pertinent to mention that the results found here are not dependent on the 
existence of metastable states, but only require d ynamics with rapid l arge amplit ude excur- 
sions as for example in intermittent systems. In iHarlim and Majdal (120081 12OIOI ) a system 
without multiple equilibria was investigated and superior filter performance was found for 
stochastic parametrizations (in this case by radically replacing all nonlinear terms by linear 
Ornstein-Uhlenbeck processes in Fourier space). We believe that their observed increased 
analysis skill can be explained by the increased ensemble spread counteracting finite sam- 
pling errors of the underlying ensemble Kalman filter approach as we did. The advantage 
when homogenization can be used is that the slow dynamics and its mean are still well 
resolved and reproduced and therefore the ensemble spread increasing stochasticity is of a 
more controlled nature. The use of homogenized stochastic climate models with adjusted 
larger diffusivity therefore amounts to incorporating deliberate but controlled model error 
into the data assimilation scheme. 



We found that the same skill can be obtained using the full deterministic model either by us- 
ing a larger ensemble, or by increasing the covariance inflation factor. The ffist is undesirable 
for large models because of the computational difficulty of simulating large ensembles and 
their covariances, while the second is undesirable because it introduces unphysical inflation 
(here a multiplicative factor of 5 > 2 was needed) and requires expensive empirical tuning 
of the inflation factor. 



Using stochastic reduced models for multi-scale systems has computational advantages. First 
of all, the dimensional reduction allows for a a gain in computational speed. Furthermore, by 
replacing a stiff ordinary differential equation by a non-stiff stochastic differential equation 
means that coarser integration time steps can be used which otherwise would lead to a break 
down when simulating the stiff multi-scale system. 

Homogenized stochastic climate models therefore provide a cheap way of simulating large 
ensembles, and a more natural way of incorporating forecast covariance inflation into the data 
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assimilation algorithm. In particular, we find that the stochastic climate model produces a 
more reliable ensemble (as characterised by the ranked probability histogram) which better 
detects the transitions between slow metastable states. This leads to a far superior analysis 
in these regimes than that produced by the full deterministic model. 
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Appendix: Sensitivity of the stochastic climate model 
to uncertainties in the drift term 

In this appendix we examine how the statistics of the stochastic ID climate model 
depends on uncertainties in the estimation of the drift term 

d{X) = ax{b - x^) . 

The parameter b controls the location of the metastable states near x* = ±\/b and their 
separation. The height of the potential barrier AV{x) = is controlled by a and b. 

In Figure [M] we show the invariant density (!20l) for the stochastic ID climate model with 
(T^ = 0.126 and several combinations of the drift term parameters a and b. The invariant 
density is more sensitive to changes in b than in a (not shown) since b simultaneously affects 
the distance of the minima of the potential and the height of the potential barrier. The lo- 
cation of the maxima of the probability density function changes by approximately 22% and 
the actual values of the maxima change by approximately 27% over the range of b values 
shown. Varying a produces similar changes in p{x), but as expected the locations of the 
maxima of the probability density function are not shifted. 

As for the diffusion coefficient, differences in the approximation of the empirical density 
imply a sensitivity of the statistics to changes in the drift term parameters. In Table [2] we 
show how the characteristic time scales change for varying drift term parameters. Note that 
now the transit time varies because by varying a we modify the rate of divergence of the 
unstable fixpoint. 

We now investigate how uncertainties in the drift term parameters a and b may affect the 
skill S in an ETKF data assimilation procedure. Figure [15] shows skill curves for fixed 
value of the diffusion coefficient = 0.126, for different combinations of a and b. As with 
the characteristic time scales, the skill S is more sensitive to changes in b than in a. The 
sensitivity to changes in the drift term is of a similar nature to the one for diffusion. Whereas 
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Figure 14: Invariant density f l20|) for the stochastic ID chmate model ([5]), with cr^ = 0.126 
and (a, 6) = (1,1) (sohd hne), (a, 6) = (1,0.8) (circles) and (a, 6) = (1,1.2) (crosses). 



Table 2: Characteristic timescales of the stochastic ID climate model ([5]) with cr^ = 0.126 
for different values of the drift coefficients a and h. 





a = 0.8 


a = 1.2 


a = 1 


a = 1 




6 = 1 


6= 1 


b = 0.8 


6 = 1.2 


^corr 


77.3 


256.3 


41.1 


645.7 


Te 


43.7 


136.1 


31.1 


212.9 


Tt 


6.4 


4.8 


7.2 


4.5 
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decreasing a or 6 by 20% produces little change in skill, increasing a or 6 by 20% makes the 
stochastic ID climate model again less skilful than the full deterministic 4D model. This 
is readily explained by noticing that, increasing b increases the difference between the two 
metastable states and also the height of the potential barrier; therefore, for fixed diffusion 
0"^, this inhibits transitions between the metastable states, thereby decreasing the spread of 
the ensemble. Similarly, decreasing a (while keeping b fixed) increases the potential barrier, 
inhibiting transitions and thereby decreasing the ensemble spread. 

As for the diffusion coefficient a^, the values of the parameters of the drift term which 
are closest to the actual values a = b = 1 for which the stochastic climate model best 
approximates the slow dynamics of the full deterministic 4D model is not the optimal value in 
terms of skill improvement. Values which increase the ensemble spread are more favourable. 



1.2 




10° 10^ A , 10^ 10^ 

^^obs 



Figure 15: Proportional skill improvement S of the stochastic ID climate model ([5]) over the 
full deterministic 4D model (II])- dl]) as a function of the observation interval Atobs for cr^ = 
0.126, with (a, 6) = (1,1) (black solid line), (a, 6) = (0.8,1) (solid, circles), (a, 6) = (1.2,1) 
(solid, crosses), (a, 6) = (1,0.8) (dashed, circles) and (a, &) = (1,1-2) (dashed, crosses). All 
other parameters are as in Figure [81 
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